Seeing Bird Sounds

How looking at spectrograms can help us hear more
Dan Brown
GGAS Master Birder Student Salon
2019-11-15

What is a spectrogram?

  • Light to dark — loudness
  • Left to right — time
  • Low to high — pitch
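The three axes above can be made concrete with a short-time Fourier transform. This is a minimal NumPy-only sketch, not the code that produced the talk's spectrograms; the function name `spectrogram`, the window and hop sizes, and the synthetic whistle are all illustrative assumptions.

```python
import numpy as np

def spectrogram(samples, sample_rate, window_size=512, hop=256):
    """Return (times, freqs, power_db) for a mono signal via a short-time Fourier transform."""
    window = np.hanning(window_size)
    starts = range(0, len(samples) - window_size + 1, hop)
    frames = np.stack([samples[i:i + window_size] * window for i in starts])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2   # rows: time slices, cols: frequencies
    power_db = 10 * np.log10(power + 1e-12)            # loudness on a log (dB) scale
    times = (np.array(starts) + window_size / 2) / sample_rate
    freqs = np.fft.rfftfreq(window_size, d=1 / sample_rate)
    return times, freqs, power_db.T                    # transposed: rows = pitch, cols = time

# Synthetic "descending whistle": pitch falls linearly from 6 kHz to 3 kHz over 1 second
sample_rate = 22050
t = np.arange(sample_rate) / sample_rate
inst_freq = 6000 - 3000 * t
phase = 2 * np.pi * np.cumsum(inst_freq) / sample_rate
whistle = np.sin(phase)

times, freqs, power_db = spectrogram(whistle, sample_rate)
peak_freqs = freqs[power_db.argmax(axis=0)]  # loudest pitch in each time slice
```

Plotted as an image, `power_db` would put time left to right, pitch low to high, and loudness as shading; `peak_freqs` traces the downward slope a listener hears as a descending whistle.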
In [ ]:
_spectro_scale = 5
_selected_recs = {
    243973: ('GCSP', 'descending, high-pitched whistles'),
    172196: ('GHOW', 'low-pitched hoots'),
    154721: ('FOSP', 'short chips, variable tempo'),
    408021: ('WREN', 'short "chirps", increasing tempo'),
    252944: ('STJA', 'harsh calls (many frequencies)'),
    110976: ('CASJ', 'harsh calls, upward pitch'),
    188039: ('WCSP', 'whistle → burry → downslurs'),
    # 174545: ('SOSP', '"tink" notes → fast trill → burry whistle'),
}
(search_recs
    .pipe(df_inspect, lambda df: (df.shape,))
    # Filter
    [lambda df: reduce(lambda x, y: x | y, [
        *[
            (df.species == species) & (df.xc_id == xc_id)
            for xc_id, (species, _) in _selected_recs.items()
        ],
        # df.species == 'WREN',  # XXX For explore
    ])]
    .pipe(df_inspect, lambda df: (df.shape,))
    # Sort
    # .sample(frac=1, random_state=0)  # XXX For explore
    .assign(_sortby=lambda df: df.xc_id.pipe(lambda s: as_ordered_cat(s, _selected_recs.keys())))
    .sort_values('_sortby')
    # .sort_values('xc_id')  # XXX For explore
    # View
    .assign(_talk_notes=lambda df: df.xc_id.map(lambda xc_id: '' if xc_id not in _selected_recs else _selected_recs[xc_id][-1]))
    .pipe(talk_recs_show,
        scale=_spectro_scale,
        append=['_talk_notes'],
    )
    [:100]
    .pipe(add_recs_to_cite)
    [[
        # 'xc_id', 'species',  # XXX For explore
        'species_com_name',
        'spectro_disp',
        '_talk_notes',
    ]]

    # The normal, simple thing
    # .pipe(talk_hide_index_and_columns)

    # HACK Un-kill td padding for not spectros
    # HACK Crop spectro img widths (doesn't change audio)
    .pipe(lambda df: display_with_style(
        talk_hide_index_and_columns(df),
        style_css=lambda scoped_class: f'''
            /* HACK Un-kill td padding for not spectros */
            .{scoped_class} .dataframe tbody tr td:nth-child(2),
            .{scoped_class} .dataframe tbody tr td:nth-child(4) {{
                padding: 0.2em 0.5em 0.2em 0.5em !important;
            }}
            /* HACK Crop spectro img widths (doesn't change audio) */
            .{scoped_class} .bubo-audio-container {{
                width: {4/10 * _spectro_scale * spectro_img_width_px}px;
                overflow: hidden;
            }}
        ''',
    ))

)
(52613, 71)
(7, 71)
Out[ ]:
   species_com_name        spectro_disp   _talk_notes
0  Golden-crowned Sparrow  [spectrogram]  descending, high-pitched whistles
1  Great Horned Owl        [spectrogram]  low-pitched hoots
2  Fox Sparrow             [spectrogram]  short chips, variable tempo
3  Wrentit                 [spectrogram]  short "chirps", increasing tempo
4  Steller's Jay           [spectrogram]  harsh calls (many frequencies)
5  California Scrub-Jay    [spectrogram]  harsh calls, upward pitch
6  White-crowned Sparrow   [spectrogram]  whistle → burry → downslurs
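The cell above picks a handful of recordings by ID and shows them in the order they were written down, not in numeric order. A minimal sketch of that idea, assuming a small stand-in table named `recs` in place of `search_recs` and using `pd.Categorical` where the talk uses its own `as_ordered_cat` helper (it also matches on `xc_id` only, a simplification of the species-and-ID filter):

```python
import pandas as pd

# Hypothetical stand-in for the talk's `search_recs` table
recs = pd.DataFrame({
    'xc_id':   [172196, 999999, 243973, 154721],
    'species': ['GHOW', 'GHOW', 'GCSP', 'FOSP'],
})

# Hand-picked recordings in presentation order (dicts preserve insertion order)
selected = {
    243973: ('GCSP', 'descending, high-pitched whistles'),
    172196: ('GHOW', 'low-pitched hoots'),
    154721: ('FOSP', 'short chips, variable tempo'),
}

picked = (recs
    [recs.xc_id.isin(list(selected))]
    .assign(
        # An ordered categorical sorts by the listed order, not numerically
        _sortby=lambda df: pd.Categorical(df.xc_id, categories=list(selected), ordered=True),
        _talk_notes=lambda df: df.xc_id.map(lambda xc_id: selected[xc_id][-1]),
    )
    .sort_values('_sortby')
    .reset_index(drop=True)
)
```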

What does a Lesser Goldfinch sound like?

image

What does a Lesser Goldfinch sound like?

  • What do we see?
In [ ]:
_spectro_scale = 5
(search_recs
    .pipe(df_inspect, lambda df: (df.shape,))
    # Filter
    [lambda df: reduce(lambda x, y: x & y, [
        df.species == 'LEGO',
        df.quality.isin(['A', 'B']),
    ])]
    .pipe(df_inspect, lambda df: (df.shape,))
    # Sort
    .sample(frac=1, random_state=2)
    # View
    .pipe(talk_recs_show,
        scale=_spectro_scale,
    )
    [:50]
    .pipe(add_recs_to_cite)
    [['spectro_disp']]
    .pipe(talk_hide_index_and_columns)
)
(52613, 71)
(110, 71)
Out[ ]:
spectro_disp: [50 Lesser Goldfinch spectrograms with audio, rows 0-49]
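The per-species cells all follow one pattern: AND together a few filters, shuffle with a fixed seed so the slide shows the same recordings on every run, and keep the first N rows. A minimal sketch of that pattern, assuming a toy `recs` table and a hypothetical helper name `sample_recs`:

```python
from functools import reduce
import pandas as pd

# Hypothetical stand-in for the talk's `search_recs` table
recs = pd.DataFrame({
    'species': ['LEGO', 'LEGO', 'LEGO', 'HUVI', 'LEGO', 'LEGO'],
    'quality': ['A', 'C', 'B', 'A', 'A', 'E'],
    'xc_id':   [1, 2, 3, 4, 5, 6],
})

def sample_recs(df, species, qualities, n, seed):
    """Keep rows matching every condition, shuffle reproducibly, return the first n."""
    conditions = [df.species == species, df.quality.isin(qualities)]
    mask = reduce(lambda x, y: x & y, conditions)  # row-wise AND of all conditions
    return df[mask].sample(frac=1, random_state=seed)[:n]

first = sample_recs(recs, 'LEGO', ['A', 'B'], n=2, seed=2)
again = sample_recs(recs, 'LEGO', ['A', 'B'], n=2, seed=2)
```

Changing `seed` gives a different but equally reproducible selection, which is presumably why the cells use different `random_state` values per species.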

Let's learn a few birds

Pacific-slope Flycatcher

  • What do we see?
In [ ]:
_spectro_scale = 5
(search_recs
    .pipe(df_inspect, lambda df: (df.shape,))
    # Filter
    [lambda df: reduce(lambda x, y: x & y, [
        df.species == 'PSFL',
        df.quality.isin(['A']),
    ])]
    .pipe(df_inspect, lambda df: (df.shape,))
    # Sort
    .sample(frac=1, random_state=0)
    # View
    .pipe(talk_recs_show,
        scale=_spectro_scale,
    )
    [:50]
    .pipe(add_recs_to_cite)
    [['spectro_disp']]
    .pipe(talk_hide_index_and_columns)
)
(52613, 71)
(67, 71)
Out[ ]:
spectro_disp: [50 Pacific-slope Flycatcher spectrograms with audio, rows 0-49]

Hutton's Vireo

  • What do we see?
In [ ]:
_spectro_scale = 5
(search_recs
    .pipe(df_inspect, lambda df: (df.shape,))
    # Filter
    [lambda df: reduce(lambda x, y: x & y, [
        df.species == 'HUVI',
        df.quality.isin(['A', 'B']),
    ])]
    .pipe(df_inspect, lambda df: (df.shape,))
    # Sort
    .sample(frac=1, random_state=2)
    # View
    .pipe(talk_recs_show,
        scale=_spectro_scale,
    )
    [:50]  # (148 recs in quality=A,B)
    .pipe(add_recs_to_cite)
    [['spectro_disp']]
    .pipe(talk_hide_index_and_columns)
)
(52613, 71)
(148, 71)
Out[ ]:
spectro_disp: [50 Hutton's Vireo spectrograms with audio, rows 0-49]

Bewick's Wren (a harder example)

  • What do we see?
In [ ]:
_spectro_scale = 5
(search_recs
    .pipe(df_inspect, lambda df: (df.shape,))
    # Filter
    [lambda df: reduce(lambda x, y: x & y, [
        df.species == 'BEWR',
        df.quality.isin(['A']),
    ])]
    .pipe(df_inspect, lambda df: (df.shape,))
    # Sort
    .sample(frac=1, random_state=1)
    # View
    .pipe(talk_recs_show,
        scale=_spectro_scale,
    )
    [:50]
    .pipe(add_recs_to_cite)
    [['spectro_disp']]
    .pipe(talk_hide_index_and_columns)
)
(52613, 71)
(168, 71)
Out[ ]:
spectro_disp: [50 Bewick's Wren spectrograms with audio, rows 0-49]

Let's dig a little deeper

Spotted Towhee: “west coast” vs. “interior”

  • What do we see?
In [ ]:
_spectro_scale = 4
(search_recs
    .pipe(df_inspect, lambda df: (df.shape,))
    # Filter
    [lambda df: reduce(lambda x, y: x & y, [
        df.species == 'SPTO',
        # df.type.str.lower().str.match(r'.*song.*'),
        # df.type.str.lower().str.match(r'^song$'),
        df.quality.isin(['A', 'B']),
    ])]
    .pipe(df_inspect, lambda df: (df.shape,))
    # Sort
    # .sample(250, random_state=0)
    # .sample(50, random_state=0)
    # .sort_values(['subspecies', 'lat'], ascending=False)
    # .sort_values(['lat'], ascending=False)
    # .sort_values(['lng'], ascending=True)
    .pipe(df_inspect, lambda df: (df.shape,))
    # View
    .pipe(talk_recs_show,
        scale=_spectro_scale,
    )

    # Bin by geo (2 bins)
    #   - Very rough, but decent results
    .assign(geo=lambda df: df.pipe(df_map_rows, lambda row: (
        'West of -117°: no intro notes → fast/slow trill' if row.lng < -117 else  # -117 to exclude Inyo County (Death Valley) which behaves ~like interior
        'East of -110°: intro notes → slow trill'         if row.lng > -110 else  # -110 to include CO/AZ/MX (at least)
        None                                                                      # Throw out everything in the middle, to simplify analysis
    )))
    [lambda df: df.geo.notnull()]
    .assign(geo=lambda df: df.geo.pipe(lambda s: s.pipe(as_ordered_cat, [
        'West of -117°: no intro notes → fast/slow trill',
        'East of -110°: intro notes → slow trill',
    ])))
    # .pipe(df_inspect, lambda df: df[:250])  # Debug

    # Pivot by geo (manually)
    .pipe(lambda df: pd.concat(axis=1, objs=[
        (df
            [df.geo == geo]
            .rename(columns={'spectro_disp': geo})
            .sample(frac=1, random_state=0)  # Randomize per column (to avoid weird biases from incidental sorting above)
            .reset_index(drop=True)
            [:25]
            .pipe(add_recs_to_cite)
            [[geo]]
        )
        for geo in df.geo.sort_values().unique()
    ]))
    .fillna('')
    .pipe(df_inspect, lambda df: (df.shape,))

    # The normal, simple thing
    # .pipe(talk_hide_index)

    # HACK Crop spectro img widths (doesn't change audio)
    .pipe(lambda df: display_with_style(
        talk_hide_index(df),
        style_css=lambda scoped_class: f'''
            /* HACK Crop spectro img widths (doesn't change audio) */
            .{scoped_class} .bubo-audio-container {{
                width: {6/10 * _spectro_scale * spectro_img_width_px}px;
                overflow: hidden;
            }}
        ''',
    ))

)
(52613, 71)
(432, 71)
(432, 71)
(25, 2)
Out[ ]:
West of -117°: no intro notes → fast/slow trill | East of -110°: intro notes → slow trill
[25 Spotted Towhee spectrograms with audio per column, rows 0-24]
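The west/interior comparison rests on a rough two-bin split by longitude: anything west of -117° goes in one column, anything east of -110° in the other, and the band in between is discarded. A minimal sketch of that binning and the side-by-side layout, assuming a toy `recs` table with made-up longitudes and a hypothetical helper `geo_bin`:

```python
import pandas as pd

WEST = 'West of -117°'
EAST = 'East of -110°'

def geo_bin(lng):
    """Assign a longitude to a bin; the band from -117° to -110° is dropped."""
    if lng < -117:
        return WEST
    if lng > -110:
        return EAST
    return None

# Hypothetical recordings with made-up longitudes
recs = pd.DataFrame({
    'xc_id': [1, 2, 3, 4, 5],
    'lng':   [-122.3, -105.1, -114.0, -118.9, -109.5],
})
binned = recs.assign(geo=recs.lng.map(geo_bin))
binned = binned[binned.geo.notnull()]

# One column per bin, re-indexed from 0 so the columns line up row by row
side_by_side = pd.concat(axis=1, objs=[
    binned[binned.geo == geo].xc_id.reset_index(drop=True).rename(geo)
    for geo in [WEST, EAST]
])
```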

What goes “wraa”?

In [ ]:
# 1
_spectro_scale = 4
_cmap_name = (
    'Pastel1'
    # 'Pastel2'
    # 'Set3'
)
_selected_recs = {
    14843:  ('SPTO', '"wraa" / mew'),
    127012: ('SPTO', '"wraa" / mew'),
}
_colors = [mpl.colors.rgb2hex(x) for x in getattr(mpl.cm, _cmap_name).colors]
(search_recs
    .pipe(df_inspect, lambda df: (df.shape,))
    # Filter
    [lambda df: reduce(lambda x, y: x | y, [
        *[
            (df.species == species) & (df.xc_id == xc_id)
            for xc_id, (species, _) in _selected_recs.items()
        ],
        # df.species == 'HUVI',  # XXX For explore
    ])]
    .pipe(df_inspect, lambda df: (df.shape,))
    # Sort
    .assign(_sortby=lambda df: df.xc_id.pipe(lambda s: as_ordered_cat(s, _selected_recs.keys())))
    .sort_values('_sortby')
    # .sort_values('xc_id')  # XXX For explore
    # .sample(frac=1, random_state=0)  # XXX For explore

    # View
    .assign(_talk_notes=lambda df: df.xc_id.map(lambda xc_id: '' if xc_id not in _selected_recs else _selected_recs[xc_id][-1]))
    .pipe(talk_recs_show,
        scale=_spectro_scale,
        append=['_talk_notes'],
    )
    [:100]
    .pipe(add_recs_to_cite)
    [[
        # 'xc_id', 'species',  # XXX For explore
        'species_com_name',
        '_talk_notes',
        'spectro_disp',
    ]]

    # The normal, simple thing
    # .pipe(talk_hide_index_and_columns)

    # HACK Un-kill td padding for not spectros
    # HACK Color rows by species
    .assign(_color=lambda df: df.species_com_name.pipe(lambda s: (
        s.map({x: _colors[i] for i, x in enumerate(unique_everseen(s.tolist()))}.__getitem__)
    )))
    .pipe(lambda df: display_with_style(
        talk_hide_index_and_columns(df
            .drop(columns=['_color'])
        ),
        style_css=lambda scoped_class: '\n'.join([
            # HACK Restrict background to first two cols (after index col)
            f'''
                .{scoped_class} tbody tr:nth-child({i + 1}) td:nth-child(2),
                .{scoped_class} tbody tr:nth-child({i + 1}) td:nth-child(3) {{
                    padding: 0.2em 0.5em 0.2em 0.5em !important; /* HACK Un-kill td padding for not spectros */
                    background: {_color} !important; /* HACK Color rows by species */
                }}
            '''
            for i, _color in enumerate(df._color.tolist())
        ]),
    ))

)
(52613, 71)
(2, 71)
Out[ ]:
   species_com_name  _talk_notes   spectro_disp
0  Spotted Towhee    "wraa" / mew  [spectrogram]
1  Spotted Towhee    "wraa" / mew  [spectrogram]
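The "wraa" tables shade each row by species so that look-alike calls are easy to group by eye. A minimal sketch of that mapping and the per-row CSS it feeds, assuming hypothetical helper names `row_colors` and `rows_css` and a made-up three-color palette (the talk takes its colors from a matplotlib colormap). Unlike the cell above, this sketch wraps around when there are more species than colors.

```python
def row_colors(species_names, palette):
    """Assign one palette color per distinct species, in order of first appearance."""
    color_by_species = {}
    for name in species_names:
        if name not in color_by_species:
            color_by_species[name] = palette[len(color_by_species) % len(palette)]
    return [color_by_species[name] for name in species_names]

def rows_css(scoped_class, colors):
    """Emit one CSS rule per table row; nth-child is 1-based, hence i + 1."""
    return '\n'.join(
        f'.{scoped_class} tbody tr:nth-child({i + 1}) td {{ background: {color}; }}'
        for i, color in enumerate(colors)
    )

palette = ['#fbb4ae', '#b3cde3', '#ccebc5']  # hypothetical pastel hex values
names = ['Spotted Towhee', 'Spotted Towhee', "Hutton's Vireo", "Hutton's Vireo"]
colors = row_colors(names, palette)
css = rows_css('talk', colors)
```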

What goes “wraa”?

In [ ]:
# 2
_spectro_scale = 4
_cmap_name = (
    'Pastel1'
    # 'Pastel2'
    # 'Set3'
)
_selected_recs = {
    14843:  ('SPTO', '"wraa" / mew'),
    127012: ('SPTO', '"wraa" / mew'),
    297119: ('HUVI', '"wraa"'),
    297120: ('HUVI', '"wraa"'),
    # 348986: ('HUVI', '"wraa"'),
}
_colors = [mpl.colors.rgb2hex(x) for x in getattr(mpl.cm, _cmap_name).colors]
(search_recs
    .pipe(df_inspect, lambda df: (df.shape,))
    # Filter
    [lambda df: reduce(lambda x, y: x | y, [
        *[
            (df.species == species) & (df.xc_id == xc_id)
            for xc_id, (species, _) in _selected_recs.items()
        ],
        # df.species == 'HUVI',  # XXX For explore
    ])]
    .pipe(df_inspect, lambda df: (df.shape,))
    # Sort
    .assign(_sortby=lambda df: df.xc_id.pipe(lambda s: as_ordered_cat(s, _selected_recs.keys())))
    .sort_values('_sortby')
    # .sort_values('xc_id')  # XXX For explore
    # .sample(frac=1, random_state=0)  # XXX For explore

    # View
    .assign(_talk_notes=lambda df: df.xc_id.map(lambda xc_id: '' if xc_id not in _selected_recs else _selected_recs[xc_id][-1]))
    .pipe(talk_recs_show,
        scale=_spectro_scale,
        append=['_talk_notes'],
    )
    [:100]
    .pipe(add_recs_to_cite)
    [[
        # 'xc_id', 'species',  # XXX For explore
        'species_com_name',
        '_talk_notes',
        'spectro_disp',
    ]]

    # The normal, simple thing
    # .pipe(talk_hide_index_and_columns)

    # HACK Un-kill td padding for not spectros
    # HACK Color rows by species
    .assign(_color=lambda df: df.species_com_name.pipe(lambda s: (
        s.map({x: _colors[i] for i, x in enumerate(unique_everseen(s.tolist()))}.__getitem__)
    )))
    .pipe(lambda df: display_with_style(
        talk_hide_index_and_columns(df
            .drop(columns=['_color'])
        ),
        style_css=lambda scoped_class: '\n'.join([
            # HACK Restrict background to first two cols (after index col)
            f'''
                .{scoped_class} tbody tr:nth-child({i + 1}) td:nth-child(2),
                .{scoped_class} tbody tr:nth-child({i + 1}) td:nth-child(3) {{
                    padding: 0.2em 0.5em 0.2em 0.5em !important; /* HACK Un-kill td padding for not spectros */
                    background: {_color} !important; /* HACK Color rows by species */
                }}
            '''
            for i, _color in enumerate(df._color.tolist())
        ]),
    ))

)
(52613, 71)
(4, 71)
Out[ ]:
   species_com_name  _talk_notes   spectro_disp
0  Spotted Towhee    "wraa" / mew  [spectrogram]
1  Spotted Towhee    "wraa" / mew  [spectrogram]
2  Hutton's Vireo    "wraa"        [spectrogram]
3  Hutton's Vireo    "wraa"        [spectrogram]

What goes “wraa”?

In [ ]:
# 3
_spectro_scale = 4
_cmap_name = (
    'Pastel1'
    # 'Pastel2'
    # 'Set3'
)
_selected_recs = {
    14843:  ('SPTO', '"wraa" / mew'),
    127012: ('SPTO', '"wraa" / mew'),
    297119: ('HUVI', '"wraa"'),
    297120: ('HUVI', '"wraa"'),
    # 348986: ('HUVI', '"wraa"'),
    268597: ('HETH', 'hoarse wheeze'),
    13677:  ('HETH', 'hoarse wheeze'),
    # 131636: ('HETH', 'hoarse wheeze'),
}
_colors = [mpl.colors.rgb2hex(x) for x in getattr(mpl.cm, _cmap_name).colors]
(search_recs
    .pipe(df_inspect, lambda df: (df.shape,))
    # Filter
    [lambda df: reduce(lambda x, y: x | y, [
        *[
            (df.species == species) & (df.xc_id == xc_id)
            for xc_id, (species, _) in _selected_recs.items()
        ],
        # df.species == 'HUVI',  # XXX For explore
    ])]
    .pipe(df_inspect, lambda df: (df.shape,))
    # Sort
    .assign(_sortby=lambda df: df.xc_id.pipe(lambda s: as_ordered_cat(s, _selected_recs.keys())))
    .sort_values('_sortby')
    # .sort_values('xc_id')  # XXX For explore
    # .sample(frac=1, random_state=0)  # XXX For explore

    # View
    .assign(_talk_notes=lambda df: df.xc_id.map(lambda xc_id: '' if xc_id not in _selected_recs else _selected_recs[xc_id][-1]))
    .pipe(talk_recs_show,
        scale=_spectro_scale,
        append=['_talk_notes'],
    )
    [:100]
    .pipe(add_recs_to_cite)
    [[
        # 'xc_id', 'species',  # XXX For explore
        'species_com_name',
        '_talk_notes',
        'spectro_disp',
    ]]

    # The normal, simple thing
    # .pipe(talk_hide_index_and_columns)

    # HACK Un-kill td padding for not spectros
    # HACK Color rows by species
    .assign(_color=lambda df: df.species_com_name.pipe(lambda s: (
        s.map({x: _colors[i] for i, x in enumerate(unique_everseen(s.tolist()))}.__getitem__)
    )))
    .pipe(lambda df: display_with_style(
        talk_hide_index_and_columns(df
            .drop(columns=['_color'])
        ),
        style_css=lambda scoped_class: '\n'.join([
            # HACK Restrict background to first two cols (after index col)
            f'''
                .{scoped_class} tbody tr:nth-child({i + 1}) td:nth-child(2),
                .{scoped_class} tbody tr:nth-child({i + 1}) td:nth-child(3) {{
                    padding: 0.2em 0.5em 0.2em 0.5em !important; /* HACK Un-kill td padding for not spectros */
                    background: {_color} !important; /* HACK Color rows by species */
                }}
            '''
            for i, _color in enumerate(df._color.tolist())
        ]),
    ))

)
(52613, 71)
(6, 71)
Out[ ]:
   species_com_name  _talk_notes    spectro_disp
0  Spotted Towhee    "wraa" / mew   [spectrogram]
1  Spotted Towhee    "wraa" / mew   [spectrogram]
2  Hutton's Vireo    "wraa"         [spectrogram]
3  Hutton's Vireo    "wraa"         [spectrogram]
4  Hermit Thrush     hoarse wheeze  [spectrogram]
5  Hermit Thrush     hoarse wheeze  [spectrogram]

What goes “wraa”?

In [ ]:
# 4
_spectro_scale = 4
_cmap_name = (
    'Pastel1'
    # 'Pastel2'
    # 'Set3'
)
_selected_recs = {
    14843:  ('SPTO', '"wraa" / mew'),
    127012: ('SPTO', '"wraa" / mew'),
    297119: ('HUVI', '"wraa"'),
    297120: ('HUVI', '"wraa"'),
    # 348986: ('HUVI', '"wraa"'),
    268597: ('HETH', 'hoarse wheeze'),
    13677:  ('HETH', 'hoarse wheeze'),
    # 131636: ('HETH', 'hoarse wheeze'),
    159366: ('WAVI', '"wraa"'),
    381527: ('WAVI', '"wraa"'),
    141349: ('BEWR', 'harsh buzz'),
    163209: ('BEWR', 'harsh buzz'),
    265810: ('HOWR', 'harsh buzz #2'),
    376229: ('BGGN', 'polyphonic wheeze'),
    81059:  ('BGGN', 'polyphonic wheeze'),
}
_colors = [mpl.colors.rgb2hex(x) for x in getattr(mpl.cm, _cmap_name).colors]
(search_recs
    .pipe(df_inspect, lambda df: (df.shape,))
    # Filter
    [lambda df: reduce(lambda x, y: x | y, [
        *[
            (df.species == species) & (df.xc_id == xc_id)
            for xc_id, (species, _) in _selected_recs.items()
        ],
        # df.species == 'HUVI',  # XXX For explore
    ])]
    .pipe(df_inspect, lambda df: (df.shape,))
    # Sort
    .assign(_sortby=lambda df: df.xc_id.pipe(lambda s: as_ordered_cat(s, _selected_recs.keys())))
    .sort_values('_sortby')
    # .sort_values('xc_id')  # XXX For explore
    # .sample(frac=1, random_state=0)  # XXX For explore

    # View
    .assign(_talk_notes=lambda df: df.xc_id.map(lambda xc_id: '' if xc_id not in _selected_recs else _selected_recs[xc_id][-1]))
    .pipe(talk_recs_show,
        scale=_spectro_scale,
        append=['_talk_notes'],
    )
    [:100]
    .pipe(add_recs_to_cite)
    [[
        # 'xc_id', 'species',  # XXX For explore
        'species_com_name',
        '_talk_notes',
        'spectro_disp',
    ]]

    # The normal, simple thing
    # .pipe(talk_hide_index_and_columns)

    # HACK Un-kill td padding for not spectros
    # HACK Color rows by species
    .assign(_color=lambda df: df.species_com_name.pipe(lambda s: (
        s.map({x: _colors[i] for i, x in enumerate(unique_everseen(s.tolist()))}.__getitem__)
    )))
    .pipe(lambda df: display_with_style(
        talk_hide_index_and_columns(df
            .drop(columns=['_color'])
        ),
        style_css=lambda scoped_class: '\n'.join([
            # HACK Restrict background to first two cols (after index col)
            f'''
                .{scoped_class} tbody tr:nth-child({i + 1}) td:nth-child(2),
                .{scoped_class} tbody tr:nth-child({i + 1}) td:nth-child(3) {{
                    padding: 0.2em 0.5em 0.2em 0.5em !important; /* HACK Un-kill td padding for not spectros */
                    background: {_color} !important; /* HACK Color rows by species */
                }}
            '''
            for i, _color in enumerate(df._color.tolist())
        ]),
    ))

)
(52613, 71)
(13, 71)
Out[ ]:
    species_com_name       _talk_notes        spectro_disp
0   Spotted Towhee         "wraa" / mew       [spectrogram]
1   Spotted Towhee         "wraa" / mew       [spectrogram]
2   Hutton's Vireo         "wraa"             [spectrogram]
3   Hutton's Vireo         "wraa"             [spectrogram]
4   Hermit Thrush          hoarse wheeze      [spectrogram]
5   Hermit Thrush          hoarse wheeze      [spectrogram]
6   Warbling Vireo         "wraa"             [spectrogram]
7   Warbling Vireo         "wraa"             [spectrogram]
8   Bewick's Wren          harsh buzz         [spectrogram]
9   Bewick's Wren          harsh buzz         [spectrogram]
10  House Wren             harsh buzz #2      [spectrogram]
11  Blue-gray Gnatcatcher  polyphonic wheeze  [spectrogram]
12  Blue-gray Gnatcatcher  polyphonic wheeze  [spectrogram]

Thanks!

  • Questions?
In [ ]:
# Credits
HTML('''
    <div class="talk-credits">
        %(children)s
    </div>
''' % dict(
    children=(recs_to_cite
        .pipe(df_map_rows, lambda row: '''
            <div class="talk-credits-credit">
                <a href="https://www.xeno-canto.org/%(xc_id)s">%(species_com_name)s: XC%(xc_id)s © %(recordist)s</a>
            </div>
        ''' % row)
        .pipe(lambda s: '\n'.join(s.tolist()))
    ),
))
Out[ ]:
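The credits cell turns every recording shown in the talk into a link back to its xeno-canto page. A minimal standard-library sketch of the same templating, assuming a hypothetical helper `credits_html` and a made-up recordist name; it also HTML-escapes the text fields, which the cell above does not do.

```python
from html import escape

def credits_html(recs):
    """Render one credit link per recording, wrapped in a single container div."""
    items = '\n'.join(
        '<div class="talk-credits-credit">'
        f'<a href="https://www.xeno-canto.org/{int(rec["xc_id"])}">'
        f'{escape(rec["species_com_name"])}: XC{int(rec["xc_id"])} © {escape(rec["recordist"])}'
        '</a></div>'
        for rec in recs
    )
    return f'<div class="talk-credits">\n{items}\n</div>'

# The recordist below is a placeholder, not a real credit
example = credits_html([
    {'xc_id': 14843, 'species_com_name': 'Spotted Towhee', 'recordist': 'A. <Recordist>'},
])
```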